Effects of Related Term Extraction in Transliteration into Chinese

نویسندگان

  • HaiXiang Huang
  • Atsushi Fujii
چکیده

To transliterate foreign technical terms and proper nouns, in Japanese and Korean, phonograms, such as Katakana and Hangul, are used. In Chinese, the pronunciation of a source word is spelled out with Kanji characters. However, because Kanji comprises ideograms, different Kanji are associated with the same pronunciation, but can potentially convey different meanings and impressions. In this paper, we propose a method to select characters in transliteration into Chinese using related terms extracted from the World Wide Web. We show the effectiveness of our method experimentally.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Incorporating Pronunciation Variation into Extraction of Transliterated-term Pairs from Web Corpora

A novel approach to automatically extracting transliterated-term pairs from Web corpora is proposed in this paper. One of the most important issues addressed is that of taking pronunciation variation into account. Pronunciation variation is a phenomenon of pronunciation ambiguity that seriously affects the term transliteration and hence affects those results produced by transliteration processe...

متن کامل

Semi-Supervised Lexicon Mining from Parenthetical Expressions in Monolingual Web Pages

This paper presents a semi-supervised learning framework for mining Chinese-English lexicons from large amount of Chinese Web pages. The issue is motivated by the observation that many Chinese neologisms are accompanied by their English translations in the form of parenthesis. We classify parenthetical translations into bilingual abbreviations, transliterations, and translations. A frequency-ba...

متن کامل

Improving Translation of Unknown Proper Names Using a Hybrid Web-based Translation Extraction Method

Recently, we have proposed several effective Web-based term translation extraction methods exploring Web resources to deal with translation of Web query terms. However, many unknown proper names in Web queries are still difficult to be translated by using our previous Web-based term translation extraction methods. Therefore, in this paper we propose a new hybrid translation extraction method, w...

متن کامل

Chinese-to-English Backward Machine Transliteration

It is challenging to transliterate named entities across languages. It is even more challenging to backward transliterate the transliterated term into its original form. This paper addresses the problem of backward translating person name from Chinese to its English counterpart. We propose a statistical backward transliteration method. Our method uses English sub-syllable and Chinese syllable a...

متن کامل

Improving Translation of Queries with Infrequent Unknown Abbreviations and Proper Names

Unknown term translation is important to CLIR and MT systems, but it is still an unsolved problem. Recently, a few researchers have proposed several effective search-result-based term translation extraction methods which explore search results to discover translations of frequent unknown terms from Web search results. However, many infrequent unknown terms, such as abbreviations and proper name...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008